A hybrid ensemble feature selection design for candidate biomarkers discovery from transcriptome profiles
Authors
Abstract
Discovering disease biomarkers from gene expression data has been greatly advanced by feature selection (FS) methods, especially using ensemble FS (EFS) strategies with perturbation at the data level (i.e., homogeneous EFS) or method level (i.e., heterogeneous EFS). Here, we proposed a hybrid EFS design that explores both types of perturbation to disrupt associations of good performance with a single dataset, algorithm, or a specific combination of both, which is particularly interesting for better reproducibility of genomic biomarkers. We investigated the adequacy of our approach for microarray data in four types of cancer, extensively comparing it with other [...] and [...] approaches. Five FS algorithms were analyzed: Wx, Symmetrical Uncertainty, Gain Ratio, Characteristic Direction, and ReliefF. We observed that, across distinct datasets, [...] approaches attenuated the large variation of most methods without function perturbation. Additionally, [...] superior [...] EFS. Interestingly, the ranks produced reached greater biological plausibility, with notably high enrichment of cancer-related genes and pathways. Thus, our experiments suggest the potential of the proposed design for discovering candidate biomarkers from gene expression data. Finally, we provide an open-source framework to support similar analyses in this and other domains, available as a user-friendly application and a programmable Python package.
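As a rough illustration of the hybrid idea in the abstract, the sketch below combines data-level perturbation (bootstrap resampling) with method-level perturbation (several ranking functions) and aggregates the resulting rankings. It is a minimal sketch under stated assumptions, not the authors' framework: the two scoring functions are simple stand-ins for the five algorithms named in the abstract, aggregation by mean rank is an assumed choice, labels are assumed to be binary (0/1), and all function names are hypothetical.

```python
import numpy as np


def mean_difference_score(X, y):
    """Stand-in ranker: absolute difference between class means per feature."""
    return np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))


def correlation_score(X, y):
    """Stand-in ranker: absolute Pearson correlation of each feature with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Zero-variance features get a score of 0 instead of NaN.
    return np.abs(Xc.T @ yc) / np.where(denom == 0, np.inf, denom)


def scores_to_ranks(scores):
    """Convert scores to ranks; rank 0 is the highest-scoring feature."""
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks


def hybrid_efs(X, y, rankers, n_bootstraps=20, seed=0):
    """Return feature indices ordered from best to worst mean rank.

    Data perturbation: each iteration draws a bootstrap sample.
    Function perturbation: every ranker is applied to every sample.
    Aggregation (assumed): mean rank over all (sample, ranker) pairs.
    """
    if np.unique(y).size < 2:
        raise ValueError("y must contain both classes")
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    all_ranks = []
    done = 0
    while done < n_bootstraps:
        idx = rng.integers(0, n_samples, size=n_samples)
        if np.unique(y[idx]).size < 2:
            continue  # redraw if the bootstrap sample lost a class
        Xb, yb = X[idx], y[idx]
        for ranker in rankers:
            all_ranks.append(scores_to_ranks(ranker(Xb, yb)))
        done += 1
    mean_rank = np.mean(all_ranks, axis=0)
    return np.argsort(mean_rank, kind="stable")
```

Calling `hybrid_efs(X, y, [mean_difference_score, correlation_score])` on an expression matrix `X` (samples by genes) returns gene indices ordered by aggregated rank. Because each ranking is averaged over many resamples and several rankers, no single dataset split or single algorithm dominates the final order.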
Similar resources
Hybrid Ensemble Gene Selection Algorithm for Identifying Biomarkers from Breast Cancer Gene Expression Profiles
Breast cancer is one of the major health hazards in the world. DNA gene expression profiles play an important role in identifying biomarkers for cancer, which help not only in accurate diagnosis of the disease but also in discovering drugs and minimizing toxicity, and thus in the effective management of the disease. In this paper we propose an algorithm for determining the biomarkers. Our hybr...
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is unwanted email that is harmful to communications around the world. Spam is a growing problem in personal email, so detecting it is essential. Machine learning is very useful for this problem, as its adaptive nature gives good results in learning the patterns required for classification. Nonetheless, in spam detection, there are a large num...
Hybrid Correlation and Causal Feature Selection for Ensemble Classifiers
The PC and TPDA algorithms are robust and well-known prototype algorithms that incorporate constraint-based approaches for causal discovery. However, neither algorithm can scale up to high-dimensional data, that is, more than a few hundred features. This chapter presents hybrid correlation and causal feature selection for ensemble classifiers to deal with this problem. Redundant features are ...
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to enhance the accuracy of credit card fraud detection. After selecting the best and most effec...
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention in recent years, due to the increasing number of modern applications associated with multi-label data. Although the field is young, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy for multi-label learning by leveraging label-specific ...
Journal
Journal title: Knowledge Based Systems
Year: 2022
ISSN: 1872-7409, 0950-7051
DOI: https://doi.org/10.1016/j.knosys.2022.109655